QDM

The Complete Collection



Process Simulation

Main Idea

Suppose you have a real world process whose efficiency you want to improve.

If there is uncertainty in this process (e.g., in the demand for a product, in the supply of some parts, in the time it takes to perform some of the work, in the quality of the work, etc.), then it is often difficult to predict the effects of making various changes to the process.

With process simulation, we create a model and then test the effects of changes made to that model. If the model provides an adequate reflection of reality, then the simulation can help us make decisions about which changes to implement.

Process simulation can be used across many different domains: queueing systems, logistics, call centers, networks, manufacturing, health care, production, and inventory, just to name a few.

Queueing Theory

Exactly what it sounds like: it is the mathematical study of lines. On its face, maybe not super exciting; however, we should consider all of the different things that tend to wait in lines.

Notation

Queues have their own notation (Kendall’s notation).

\(A/B/C/D\)

Where:

\(A = arrival\, process\)

\(B = service\, process\)

\(C = server\, number\)

\(D = queue\, capacity\)

Below are some common ones:

\(M/D/k\)

\(M/M/k\)

M generally stands for Markovian (i.e., exponentially distributed times).

D is deterministic: all jobs require a fixed amount of time.

k is the number of servers/workers/etc.

Both of these are generally assumed to have an infinite buffer.

\(M/M/1/K\)

Here the K is specifying a buffer size.

If a queue is \(M/D/1\), we can easily compute some helpful statistics (we don’t need any fancy software to help us).

\(\lambda\) = arrival rate

\(\mu\) = service rate

\(\rho = \frac{\lambda}{\mu}\) = utilization

Average number of entities in the system is:

\[ L = \rho + \frac{1}{2}\Bigg(\frac{\rho^2}{1 - \rho}\Bigg) \]

Average number in queue:

\[ L_Q = \frac{1}{2}\Bigg(\frac{\rho^2}{1 - \rho}\Bigg) \]

Average system waiting time:

\[ \omega = \frac{1}{\mu}+\frac{\rho}{2\mu(1 - \rho)} \]

Average waiting time in queue:

\[ \omega_Q = \frac{\rho}{2\mu(1 - \rho)} \]
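Putting those formulas to work only takes a few lines of R; here is a small helper (the rates in the example call are made up purely for illustration):

```r
# M/D/1 statistics from an arrival rate (lambda) and a service rate (mu)
md1_stats = function(lambda, mu) {
  rho = lambda / mu                   # utilization
  L_Q = rho^2 / (2 * (1 - rho))       # average number waiting in the queue
  L   = rho + L_Q                     # average number in the system
  w_Q = rho / (2 * mu * (1 - rho))    # average wait in the queue
  w   = 1 / mu + w_Q                  # average time in the system
  c(rho = rho, L = L, L_Q = L_Q, w = w, w_Q = w_Q)
}

# e.g., 0.4 arrivals per minute against a service rate of 0.5 per minute
md1_stats(lambda = 0.4, mu = 0.5)
```

As a quick sanity check, the results obey Little’s law: \(L = \lambda\omega\) and \(L_Q = \lambda\omega_Q\).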

Markov Chains (an aside)

Much like flipping a log over will yield an assortment of creatures, peeling back the layers of many methods will reveal a Markov Chain. And just like those assorted critters, you don’t really know what you are looking at when you see them.

Here is a conceptual example of Markov chains. This example is adapted from Richard McElreath’s excellent book, Statistical Rethinking.

You manage 10 teams, named 1 through 10. Each team’s name matches the number of people on the team, and each team is on a separate floor of the building.

You need to visit a team every day, proportional to the number of people on the team (i.e., you would visit team 10 more often than team 1).

At the end of the day, you randomly select whether a proposed move will take you up a floor or down a floor.

After randomly selecting up or down, you grab a number of pens that is equal to the number of the current team (for team 10, you would grab 10 pens).

Next, you grab a number of pencils corresponding to the proposed move. Suppose your randomly selected proposal takes you up a floor from team 10; the floors wrap around, so the proposal lands you back at the bottom with team 1. You would then have 10 pens and 1 pencil.

If you have more pencils than pens, you will always move to the proposed floor.

If you have more pens than pencils, you set aside a number of pens equal to the number of pencils; the remaining pens and all of the pencils go into a drawer.

You reach into the drawer to randomly select a pen or a pencil.

Your selection decides where you go – pen is stay and pencil is go!

Over 1000 moves, here is what our chain would look like:
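We can sketch this pen-and-pencil routine directly in R. This is a minimal simulation, assuming (as in the story) that proposals wrap around at the top and bottom floors:

```r
set.seed(10)  # for reproducibility

nMoves = 1000
visits = numeric(nMoves)
current = 1

for (i in 1:nMoves) {
  visits[i] = current
  # propose moving up or down one floor, wrapping 10 -> 1 and 1 -> 10
  proposal = current + sample(c(-1, 1), size = 1)
  if (proposal > 10) proposal = 1
  if (proposal < 1) proposal = 10
  # pens = current team number, pencils = proposed team number;
  # drawing a pencil (i.e., moving) happens with probability proposal/current,
  # which is 1 whenever there are more pencils than pens
  if (runif(1) < proposal / current) current = proposal
}

# time spent with each team is roughly proportional to the team number
table(visits)
```

Even though each move only ever looks one floor up or down, the long-run visit counts settle into the target proportions.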

Our Markov chains work in a similar way, but they can also be conceptualized as a birth-death process.

Basic Steps

  1. Draw a process flow map.

  2. Obtain data.

  3. Input the model and the data.

  4. Validate the model.

  5. Experiment with the simulation.

  6. Analyze the results.

  7. Profit!

Distributions

Our input data generally takes the form of a distribution with some known properties.

Normal Distribution

For our normal distribution, we know the \(\mu\) and \(\sigma\).

Exponential Distribution

We only know one thing about the exponential distribution: its mean, \(\mu\).

Poisson Distribution

The Poisson is an interesting distribution – it tends to deal with count-related variables, telling us the probability of a count occurring. We know its \(\lambda\).

Uniform Distribution

While people tend to think about the Gaussian distribution as the most vanilla of all distributions, it really is not – I would say that distinction belongs to the uniform distribution. We don’t even get any fancy Greek letters, just a minimum and a maximum. Why? Because knowing the min and max will tell us that there is an equal probability of drawing a value anywhere within that range.
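Each of these distributions is easy to sample in R. The parameter values below are just illustrations (the normal and exponential values echo the bank example that comes later; the rest are arbitrary):

```r
set.seed(1)  # for reproducibility

normalDraws  = rnorm(n = 1000, mean = 2.4, sd = 0.5)  # e.g., service times
expDraws     = rexp(n = 1000, rate = 1/2)             # interarrival times with a mean of 2
poissonDraws = rpois(n = 1000, lambda = 3)            # counts per interval
uniformDraws = runif(n = 1000, min = 0, max = 10)     # equal probability across [0, 10]

# the sample means land near each distribution's expected value
c(mean(normalDraws), mean(expDraws), mean(poissonDraws), mean(uniformDraws))
```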

Using SimQuick

Process Flow Map

How does the flow for a bank typically look?

  1. A customer enters the bank through the door.

  2. The customer will either go directly to a teller or wait in line for the teller.

  3. The teller will serve the customer.

  4. The customer has been served and will exit the bank.

SimQuick has five elements to model a process: Entrance, Exit, Work Station, Buffer, Decision Point. We need to map those elements onto our bank example:

Data

We need to know a few things: how long does it take to serve a customer, how much time between customer arrivals (the interarrival time), and the capacity of the line.

Performance

The service level for each simulation is the fraction of the demand that is satisfied.

\[ Entrance \: Service \: Level = \frac{Objects \: Entering}{Objects \: Entering + Objects \: Unable \: To \: Enter}\]

The overall mean service level of the process is the mean of the service levels calculated from each simulation.

The mean cycle time at a buffer is the mean amount of time an object takes to move through the buffer during a simulation.

The overall mean cycle time at a buffer is the mean of the mean cycle time of the buffer for each simulation.
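As a quick illustration of the entrance service level, suppose we had recorded the following (hypothetical) counts across three simulation runs:

```r
# hypothetical counts from three simulation runs
objectsEntering = c(55, 60, 58)
objectsUnable   = c(5, 2, 4)

serviceLevel = objectsEntering / (objectsEntering + objectsUnable)
serviceLevel         # service level for each run

mean(serviceLevel)   # overall mean service level across the runs
```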

An Example For SimQuick

The Bank

The interarrival times for customers follow an exponential distribution with \(\mu = 2 \, minutes\).

The line in the bank holds 8 people. If a person arrives when the line is full, that person will not get in line.

The teller’s service time can be approximated by a normal distribution with \(\mu = 2.4 \, minutes\) and \(\sigma = .5 \, minutes\).

System Improvements

Automated Teller

If we add an automated teller, there is evidence that service time per customer would decrease to \(\mu = 2 \, minutes\)

Additional Teller

Adding the automated teller only slightly changed our system, but adding another teller (i.e., work station) will require us to create a new process flow:

Airport Security

We need to add a decision point to our model.

Inventory

We can extend our conceptual notion of a queue to items in storage (think about a product, stored on a shelf, just waiting to be used). Inventory decisions are generally guided by 3 factors:

Bread Delivery

Let’s look at an example of an order-up-to policy:

We have some considerations to make:

  1. We need to be able to serve the people who want a specific type of bread (our service level).

  2. We don’t want too much space tied up with bread that we don’t need.

  3. We want to achieve a 99% service level.

In terms of our process diagram, we have the following:

Foray Into R

SimQuick has given us an excellent bearing on how to perform a process simulation; now we can do the same in R. While the mechanisms change, the concepts still apply.

We can start to replicate the vanilla bank example. The first thing we will do is to create our process flow map with DiagrammeR:


install.packages("DiagrammeR")

It might look weird at first, but it becomes pretty “plug-and-play” once we see what is happening:


library(DiagrammeR)

grViz("
digraph {
  graph [overlap = true, fontsize = 10, rankdir = LR]
  
  node [shape = box, style = filled, color = black, fillcolor = aliceblue]
  A [label = 'Door']
  B [label = 'Line']
  C [label = 'Teller']
  D [label = 'Served Customer']

  A->B B->C C->D
}
")

The Bank

We will need the simmer package for our simulation, which we can install with install.packages("simmer").

Once we have simmer installed, we need to load it:


library(simmer)

Let’s start by defining a customer’s trajectory. First, we will provide a name for trajectory().


customer = trajectory("Customer path")

Next, we need to initiate a start time with set_attribute() – we will use now() to grab the current time from our not-yet-created bank environment.


customer = trajectory("Customer path") %>% 
  set_attribute("start_time", function() {now(bank)})

After establishing our time, the next step for a customer is to seize() the “teller” (which we will define later).


customer = trajectory("Customer path") %>% 
  set_attribute("start_time", function() {now(bank)}) %>%
  seize("teller")

Now things start to get tricky. We need to use timeout() to specify how long a customer is using the teller – this is the teller’s average working time.

We can specify how long a teller is seized (i.e., how long the teller is working) in very much the same way we would in SimQuick – we provide a distribution with the appropriate values.


customer = trajectory("Customer path") %>% 
  set_attribute("start_time", function() {now(bank)}) %>%
  seize("teller") %>% 
  timeout(function() {rnorm(n = 1, mean = 2.4, sd = .5)})

After a customer spends time with the teller, the customer releases the teller.


customer = trajectory("Customer path") %>% 
  set_attribute("start_time", function() {now(bank)}) %>%
  seize("teller") %>% 
  timeout(function() {rnorm(n = 1, mean = 2.4, sd = .5)}) %>% 
  release("teller")

This is all we need to do for a customer, so now we can turn our attention to the bank.

Our bank is going to provide the environment that houses our trajectory. So, we can start by creating an environment with simmer():


bank = simmer("bank")

Once we have our simulation environment defined, we can add resources to it with the aptly-named add_resource() function. This is where we specify what is being seized by our customer. We also need to provide some additional information to our resource: capacity and queue_size.


bank = simmer("bank") %>% 
  add_resource("teller", capacity = 1, queue_size = 8)

To this point, we have our customer behavior (how they move through our process) and information about our work stations. The last detail is the interarrival time, which we can specify with add_generator(). It works in very much the same way as timeout(), in that we are specifying a distribution. The rexp() function in R does not take a mean like SimQuick does; instead, it takes a rate. If we remember that, on average, one person comes into the bank every two minutes, we can define our rate as \(\frac{1}{2}\).


bank = simmer("bank") %>% 
  add_resource("teller", capacity = 1, queue_size = 8) %>% 
  add_generator("Customer", customer, function() {
    c(0, rexp(n = 100, rate = 1/2), -1)
  })

Now we can run our simulation; we just need to provide a time value for the until argument.


run(bank, until = 120)

If we put it together, here is what we have:


customer = trajectory("Customer path") %>% 
  set_attribute("start_time", function() {now(bank)}) %>%
  seize("teller") %>% 
  timeout(function() {rnorm(n = 1, mean = 2.4, sd = .5)}) %>% 
  release("teller")

bank = simmer("bank") %>% 
  add_resource("teller", capacity = 1, queue_size = 8) %>% 
  add_generator("Customer", customer, function() {c(0, rexp(100, 1/2), -1)})

run(bank, until = 120)

simmer environment: bank | now: 120 | next: 121.594441054578
{ Monitor: in memory }
{ Resource: teller | monitored: TRUE | server status: 1(1) | queue status: 5(8) }
{ Source: Customer | monitored: 1 | n_generated: 101 }

Finally, we can start to look at our data:


result = get_mon_arrivals(bank)

head(result)

        name start_time  end_time activity_time finished replication
1  Customer0  0.0000000  1.610082      1.610082     TRUE           1
2  Customer1  0.1144445  5.357455      3.747373     TRUE           1
3  Customer2  0.2022296  7.900077      2.542622     TRUE           1
4  Customer3  0.8202680 10.153819      2.253743     TRUE           1
5  Customer4  3.8978576 12.728956      2.575137     TRUE           1
6 Customer14 14.3070146 14.307015      0.000000    FALSE           1

But…that is just one simulation. SimQuick provides an easy way to increase the run count, but R needs something a little different.

We have a few choices. The first is to simply replicate our procedure a certain number of times:


sim50Runs = replicate(50, expr = {
  customer = trajectory("Customer path") %>% 
    set_attribute("start_time", function() {now(bank)}) %>%
    seize("teller") %>% 
    timeout(function() {rnorm(n = 1, mean = 2.4, sd = .5)}) %>% 
    release("teller")
  
  bank = simmer("bank") %>% 
    add_resource("teller", capacity = 1, queue_size = 8) %>% 
    add_generator("Customer", customer, function() {c(0, rexp(100, 1/2), -1)})
  
  run(bank, until = 120)
  
  result = get_mon_arrivals(bank)
}, simplify = FALSE)

We can also get a bit fancier and really use our machine:


library(parallel)

cl = makeCluster(detectCores() - 1)

clusterEvalQ(cl, library(simmer))

allResults = parLapply(cl, 1:50, function(the_seed) {
  set.seed(the_seed)
  
  customer = trajectory("Customer path") %>% 
    set_attribute("start_time", function() {now(bank)}) %>%
    seize("teller") %>% 
    timeout(function() {rnorm(n = 1, mean = 2.4, sd = .5)}) %>% 
    release("teller")
  
  bank = simmer("bank") %>% 
    add_resource("teller", 1, queue_size = 8) %>% 
    add_generator("Customer", customer, function() {c(0, rexp(100, 1/2), -1)})
  
  bank %>% 
    run(until = 120)
  
  result = bank %>%
    get_mon_arrivals %>%
    dplyr::mutate(waiting_time = end_time - start_time - activity_time)
  
  return(result)
  
})

stopCluster(cl)

Adding Teller 2

Adding another teller to our simulation is very easy – we only need to increase the capacity of our teller resource:


customer = trajectory("Customer path") %>% 
  set_attribute("start_time", function() {now(bank)}) %>%
  seize("teller") %>% 
  timeout(function() {rnorm(n = 1, mean = 2.4, sd = .5)}) %>% 
  release("teller")

bank = simmer("bank") %>% 
  add_resource("teller", 2, queue_size = 8) %>% 
  add_generator("Customer", customer, function() {c(0, rexp(100, 1/2), -1)})

bank %>% 
  run(until = 120)

result = bank %>%
  get_mon_arrivals %>%
  dplyr::mutate(waiting_time = end_time - start_time - activity_time)

Other Elements

We won’t go over them today, but we can use branch() within a trajectory in the same way that we use Decision Points in SimQuick.

Just like we used Scenario Variables in SimQuick, we could create an omnibus function and insert arguments as needed.

Things You Learn

In your downtime, check this site out: Business Process Analysis in R

Manufacturing

In manufacturing, one of the more important metrics we deal with is throughput (the number of units produced over a given period of time).

We are also interested in the work-in-process (WIP) inventory.

Variability, while not something we can always measure/predict, can come from a few sources.

Controlling variability can help to increase throughput.

An Example

Improvements

What can we do to improve our throughput?

Working With Simmer Data


library(parallel)

cl = makeCluster(detectCores() - 1)

clusterEvalQ(cl, library(simmer))

allResults = parLapply(cl, 1:50, function(the_seed) {
  set.seed(the_seed)
  
  customer = trajectory("Customer path") %>% 
    set_attribute("start_time", function() {now(bank)}) %>%
    seize("teller") %>% 
    timeout(function() {rnorm(n = 1, mean = 2.4, sd = .5)}) %>% 
    release("teller")
  
  bank = simmer("bank") %>% 
    add_resource("teller", 1, queue_size = 8) %>% 
    add_generator("Customer", customer, function() {c(0, rexp(100, 1/2), -1)})
  
  bank %>% 
    run(until = 120)
  
  result = bank %>%
    get_mon_arrivals %>%
    dplyr::mutate(waiting_time = end_time - start_time - activity_time, 
                  runCount = the_seed, 
                  arrival = 1:nrow(.))
  
  return(result)
  
})

stopCluster(cl)

After running 50 simulations, we now have a list of 50 data frames:


rmarkdown::paged_table(allResults[[1]])

We might want to put all 50 of our data frames into one big data frame:


# install.packages("data.table")

allResults = data.table::rbindlist(allResults)

With our data in one table, we can compute some basic stats:


library(dplyr)

runStats = allResults %>% 
  group_by(runCount) %>% 
  mutate(finishNumber = ifelse(finished == TRUE, 1, 0)) %>% 
  summarize(meanActivity = mean(activity_time[which(finished == TRUE)]), 
            meanWait = mean(waiting_time[which(finished == TRUE)]), 
            finishedCount = sum(finishNumber), 
            totalTried = n(), 
            serviceRate = finishedCount / totalTried)

From there, we can use functions like mean() to get the overall means:


mean(runStats$meanWait)

[1] 10.41251

We can also visualize our results:


# install.packages("ggplot2")

library(ggplot2)

allResults %>% 
  filter(runCount < 10) %>% 
  ggplot(., aes(arrival, start_time)) +
  geom_point() +
  facet_wrap( ~ runCount) +
  theme_minimal()


allResults %>% 
  filter(runCount < 10) %>% 
  ggplot(., aes(waiting_time)) +
  geom_histogram(bins = 10) +
  facet_wrap( ~ runCount) +
  theme_minimal()

Linear Programming

Graphical Method

Maximization

\[ \begin{align} P = 4x_1 + 5x_2 \\ 2x_1 + 3x_2 \leq 120 \\ 2x_1 + 1.5x_2 \leq 80 \\ x_1, x_2 \geq 0 \end{align} \]

The objective function is \(P = 4x_1 + 5x_2\). Every other equation is a constraint. Without constraints, our optimization will do just that – optimize (much like my younger brother, they tend to be lazy and greedy).

If we want to put some concrete terms to our equations, we could specify something as follows:

P is profit (of course we want to maximize profit).

For every piece that machine 1 makes, we make $4. For every piece that machine 2 makes, we make $5.

For every piece made, machine 1 requires 2 pounds of filings and machine 2 requires 3 pounds. We currently have 120 pounds of filings.

For every piece made, machine 1 requires 2 pounds of resin and machine 2 requires 1.5 pounds. We currently have 80 pounds of resin.

\(x_1\) is the number of pieces for machine 1.

\(x_2\) is the number of pieces for machine 2.

To begin, we need to express our inequalities as equations:

\[ \begin{align} 2x_1 + 3x_2 = 120 \\ 2x_1 + 1.5x_2 = 80 \\ x_1 = 0 \\ x_2 = 0 \end{align} \]

This will let us define all ordered pairs that will satisfy the equations.

Our first equation, \(2x_1 + 3x_2 = 120\), can be solved as \((2 * 0) + (3 * 40) = 120\) or as \((2 * 60) + (3 * 0) = 120\). For that equation, we have both \((x_1 = 0, x_2 = 40)\) and \((x_1 = 60, x_2 = 0)\).

For our second equation, \(2x_1 + 1.5x_2 = 80\), we can solve that as \((2 * 0) + (1.5 * 53.33) = 80\) or as \((2 * 40) + (1.5 * 0) = 80\). For that equation, we have both \((x_1 = 0, x_2 = 53.33)\) and \((x_1 = 40, x_2 = 0)\).
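As a sanity check on that arithmetic, base R’s solve() will hand us the point where the two constraint lines cross (which will turn out to be one of our extreme points later on):

```r
# intersection of 2*x1 + 3*x2 = 120 and 2*x1 + 1.5*x2 = 80
A = rbind(c(2, 3),
          c(2, 1.5))
b = c(120, 80)

solve(A, b)  # x1 = 20, x2 = 26.67
```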

We can plot those points:


library(dplyr)

library(ggplot2)

graphPoints = data.frame(x = c(0, 60, 0, 40), 
    y = c(40, 0, 53.33, 0), 
    eq = as.factor(c(1, 1, 2, 2)))

ggplot(graphPoints, aes(x, y, group = eq)) +
  geom_point() +
  geom_line() +
  theme_minimal() +
  annotate("text", x = 15, y = 50, label = "2x1 + 1.5x2 <= 80") +
  annotate("text", x = 50, y = 15, label = "2x1 + 3x2 <= 120") +
  labs(x = "x1", y = "x2")

We can start to define the feasible region for our solution:


ggplot() +
  geom_point(data = graphPoints, mapping = aes(x, y, group = eq)) +
  geom_line(data = graphPoints, mapping = aes(x, y, group = eq)) +
  geom_segment(mapping = aes(x = 9, xend = 7, 
               y = 34, yend = 30), arrow = arrow()) +
  geom_segment(mapping = aes(x = 29, xend = 27, 
               y = 15, yend = 11), arrow = arrow()) +
  theme_minimal() +
  labs(x = "x1", y = "x2")

If we consider our constraints, we would probably be working within this region:


ggplot() +
  geom_point(data = graphPoints, mapping = aes(x, y, group = eq)) +
  geom_line(data = graphPoints, mapping = aes(x, y, group = eq)) +
  geom_polygon(data = data.frame(x = c(0, 40, 20, 0), 
                               y = c(0, 0, 26.67, 40)), 
            mapping = aes(x, y), color = "grey", alpha = .1) +
  theme_minimal() +
  labs(x = "x1", y = "x2")

If we want to maximize our function, then we are going to need to look at every ordered pair within the feasible region to get to the optimized solution – that should not take you too long to do by hand…

Or, if you actually want to move along with your life, we can use the extreme point theorem:

Where are our extreme values? Right here!


ggplot() +
  geom_point(data = graphPoints, mapping = aes(x, y, group = eq)) +
  geom_line(data = graphPoints, mapping = aes(x, y, group = eq)) +
  geom_point(data = data.frame(x = c(0, 40, 20, 0), 
                               y = c(0, 0, 26.67, 40)), 
             mapping = aes(x, y), color = "#ff5500", size = 5) +
  geom_polygon(data = data.frame(x = c(0, 40, 20, 0), 
                               y = c(0, 0, 26.67, 40)), 
            mapping = aes(x, y), color = "grey", alpha = .1) +
  theme_minimal() +
  labs(x = "x1", y = "x2")

Now that we know our extreme values, it is just a matter of finding which will give us the maximum value:


extremePoints = data.frame(x = c(0, 40, 20, 0), 
                           y = c(0, 0, 26.67, 40))

x1 = 4

x2 = 5

(x1 * extremePoints$x) + (x2 * extremePoints$y)

[1]   0.00 160.00 213.35 200.00

We can see that a solution with 20 pieces on machine 1 and 26.67 pieces on machine 2 yields the most profit. We know that we probably won’t want to make an incomplete piece, so we can see what happens if we round that down:


roundedX1 = 20

roundedX2 = floor(26.67)

(x1 * roundedX1) + (x2 * roundedX2)

[1] 210

Rounding down still satisfies both constraints ((2 * 20) + (3 * 26) = 118 and (2 * 20) + (1.5 * 26) = 79), at the cost of a little profit. Rather than checking extreme points by hand, we can also hand the whole problem to the linprog package:

library(linprog)

c = c(4, 5)

b = c(120, 80)

A = rbind(c(2, 3), c(2, 1.5))

res = solveLP(c, b, A, maximum = TRUE)

res


Results of Linear Programming / Linear Optimization

Objective function (Maximum): 213.333 

Iterations in phase 1: 0
Iterations in phase 2: 2
Solution
      opt
1 20.0000
2 26.6667

Basic Variables
      opt
1 20.0000
2 26.6667

Constraints
  actual dir bvec free     dual dual.reg
1    120  <=  120    0 1.333333       40
2     80  <=   80    0 0.666667       20

All Variables (including slack variables)
        opt cvec   min.c    max.c      marg marg.reg
1   20.0000    4 3.33333 6.666667        NA       NA
2   26.6667    5 3.00000 6.000000        NA       NA
S 1  0.0000    0    -Inf 1.333333 -1.333333       40
S 2  0.0000    0    -Inf 0.666667 -0.666667       20

Minimization

If our object function is to minimize, we need to tweak things just a bit. Consider the following:

\[ \begin{align} C = 50x_1 + 20x_2 \\ 2x_1 - x_2 \geq 0 \\ x_1 + 4x_2 \geq 80 \\ 0.9x_1 + 0.8x_2 \geq 40 \\ x_1, x_2 \geq 0 \end{align} \]

This isn’t anything we haven’t seen, but it has taken a slightly different form. Before, our goal was to maximize profit; now, our goal is to minimize cost.

We can work through our constraints in the same fashion as before.

\(2x_1 - x_2 = 0\) gives us \((x_1 = 0, x_2 = 0)\) and \((x_1 = 30, x_2 = 60)\); \(x_1 + 4x_2 = 80\) gives us \((x_1 = 0, x_2 = 20)\) and \((x_1 = 80, x_2 = 0)\); and \(0.9x_1 + 0.8x_2 = 40\) gives us \((x_1 = 44.44, x_2 = 0)\) and \((x_1 = 0, x_2 = 50)\).

Why, in the name of all that is holy, would that first equation not reduce down further? If we plot things, we will find out pretty quickly:


graphPoints = data.frame(x = c(0, 30, 0, 80, 44.44, 0), 
    y = c(0, 60, 20, 0, 0, 50), 
    eq = as.factor(c(1, 1, 2, 2, 3, 3)))

ggplot(graphPoints, aes(x, y, group = eq)) +
  geom_point() +
  geom_line() +
  theme_minimal() +
  annotate("text", x = 30, y = 50, label = "2x1 - x2 >= 0") +
  annotate("text", x = 50, y = 12, label = "x1 + 4x2 >= 80") +
  annotate("text", x = 30, y = 25, label = ".9x1 + .8x2 >= 40") +
  labs(x = "x1", y = "x2")

We still want to find our extreme points, but remember that we are looking at a minimized objective function. To that end, our feasible region is going to be in a different place.


ggplot() +
  geom_point(data = graphPoints, mapping = aes(x, y, group = eq)) +
  geom_line(data = graphPoints, mapping = aes(x, y, group = eq)) +
  geom_polygon(data = data.frame(x = c(16, 34.32, 80, 80, 30), 
                               y = c(32, 11.42, 0, 60, 60)), 
            mapping = aes(x, y), color = "grey", alpha = .1) +
  theme_minimal() +
  labs(x = "x1", y = "x2")

This will help us to find our extreme points for testing:


ggplot() +
  geom_point(data = graphPoints, mapping = aes(x, y, group = eq)) +
  geom_line(data = graphPoints, mapping = aes(x, y, group = eq)) +
    geom_point(data = data.frame(x = c(16, 34.32, 80), 
                               y = c(32, 11.42, 0)), 
             mapping = aes(x, y), color = "#ff5500", size = 5) +
  geom_polygon(data = data.frame(x = c(16, 34.32, 80, 80, 30), 
                               y = c(32, 11.42, 0, 60, 60)), 
            mapping = aes(x, y), color = "grey", alpha = .1) +
  theme_minimal() +
  labs(x = "x1", y = "x2")

Excellent. Now we can solve for those extreme values and get our solution:


extremePoints = data.frame(x1 = c(16, 34.32, 80), 
                               x2 = c(32, 11.42, 0))

x1 = 50

x2 = 20

(x1 * extremePoints$x1) + (x2 * extremePoints$x2)

[1] 1440.0 1944.4 4000.0

With our cost minimized to 1440, we should select 16 for x1 and 32 for x2.

Just for giggles, we can throw our solution back into our original equations to see what comes up:


solutionX1 = 16

solutionX2 = 32

(2 * solutionX1) - (1 * solutionX2) # Equation 1 => 0

[1] 0

(1 * solutionX1) + (4 * solutionX2) # Equation 2 => 80

[1] 144

(.9 * solutionX1) + (.8 * solutionX2) # Equation 3 => 40

[1] 40

Your Turn

Try to find an optimal solution to this problem:

\[ \begin{align} P = 4x_1 + 3x_2 \\ 21x_1 + 16x_2 \leq 336 \\ 13x_1 + 25x_2 \leq 325 \\ 15x_1 + 18x_2 \leq 270 \\ x_1, x_2 \geq 0 \end{align} \]
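Once you have worked the problem by hand, one way to check yourself is to lean on the extreme point theorem programmatically: enumerate every intersection of constraint boundaries, keep the feasible points, and evaluate the objective at each. This is a base R sketch (the matrix layout is just one way to organize it):

```r
obj = c(4, 3)
A = rbind(c(21, 16),  # 21x1 + 16x2 <= 336
          c(13, 25),  # 13x1 + 25x2 <= 325
          c(15, 18),  # 15x1 + 18x2 <= 270
          c(1, 0),    # boundary x1 = 0
          c(0, 1))    # boundary x2 = 0
b = c(336, 325, 270, 0, 0)

# every pairwise intersection of the boundary lines
vertices = t(combn(nrow(A), 2, function(idx) {
  tryCatch(solve(A[idx, ], b[idx]), error = function(e) c(NA, NA))
}))

# keep the points that satisfy all of the original constraints
feasible = apply(vertices, 1, function(pt) {
  !anyNA(pt) && all(pt >= -1e-9) && all(A[1:3, ] %*% pt <= b[1:3] + 1e-9)
})
candidates = vertices[feasible, , drop = FALSE]

candidates[which.max(candidates %*% obj), ]  # the optimal extreme point
max(candidates %*% obj)                      # the maximized objective
```

The same pattern (or a call to solveLP(), as above) works for any two-variable problem in this section.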

Simplex Method

The graphical method works, but can become unwieldy as we add constraints.

The Simplex Method is one of those methods that is old (1947), but still useful. Whenever the objective function and the constraints are all linear, the Simplex Method applies.

There are several steps and rules to solving a problem with the simplex method. The first step is to convert our constraint inequalities to equations for the purpose of finding basic feasible solutions. This conversion will happen with slack variables and surplus variables. For example, we would take the following inequalities:

\[ \begin{align} P = x_1 + x_2 \\ 3x_1 + 2x_2 \leq 40 \\ 2x_1 + x_2 \geq 10 \\ \end{align} \]

And convert them to equations with slack variable \(x_3\) and surplus variable \(x_4\):

\[ \begin{align} 3x_1 + 2x_2 + x_3 = 40 \\ 2x_1 + x_2 - x_4 = 10 \\ \end{align} \]

But now we arrive at an interesting issue: we have 2 equations (m) and 4 variables (n). Whenever \(n > m\), we have an infinite number of solutions (sound familiar so far?).

With this group of variables and equations, we can apply the basis theorem. The basis theorem tells us that for a system of m equations and n variables, where \(n > m\), the extreme points are basic solutions in which at least \(n - m\) of the variables equal zero. With 4 variables and 2 equations, at least 2 variables will be zero in each basic solution. Substituting 0 in for pairs of variables will give us our basic solutions.

If we set x3 and x4 to 0, we get the following 2 equations:

\[ \begin{align} 3x_1 + 2x_2 = 40 \\ 2x_1 + x_2 = 10 \\ \end{align} \]

In the equations above, solving simultaneously gives \(x_1 = -20\) and \(x_2 = 50\). The negative value does not work for us, so we need to look at other combinations of 0.

We could also set x1 and x2 to 0 to get the following:

\[ \begin{align} x_3 = 40 \\ -x_4 = 10 \\ \end{align} \]

Also not going to work! What about setting x1 and x3 to 0:

\[ \begin{align} 2x_2 = 40 \\ x_2 - x_4 = 10 \\ \end{align} \]

What is x2 and what is x4?

Now we can do the same thing with every other combination:

   x1  x2  x3    x4  objective
  0.0   0  40 -10.0        0.0
  0.0  20   0  10.0       20.0
  0.0  10  20   0.0       10.0
 13.3   0   0  16.6       13.3
  5.0   0  25   0.0        5.0
-20.0  50   0   0.0       30.0
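That table can be generated rather than built by hand. Here is a sketch that zeroes each pair of variables and solves the remaining 2×2 system with base R:

```r
# 3x1 + 2x2 + x3      = 40
# 2x1 +  x2      - x4 = 10
A = rbind(c(3, 2, 1, 0),
          c(2, 1, 0, -1))
b = c(40, 10)

# for each pair of zeroed variables, solve for the other two
basicSolutions = t(combn(4, 2, function(zeroed) {
  keep = setdiff(1:4, zeroed)
  x = numeric(4)
  x[keep] = solve(A[, keep], b)
  x
}))
colnames(basicSolutions) = c("x1", "x2", "x3", "x4")

cbind(round(basicSolutions, 1),
      objective = basicSolutions[, "x1"] + basicSolutions[, "x2"])
```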

As a proof of concept, let’s take those points with feasible values and plot them:


data.frame(x = c(0, 0, 13.3, 5), 
           y = c(20, 10, 0, 0), 
           eq = c(2, 3, 4, 5)) %>% 
  ggplot(., aes(x, y)) + 
  geom_point() +
  theme_minimal() +
  labs(x = "x1", y = "x2")

This is just the set-up for the Simplex Tableau. I hate to disappoint, but we are not going to go any further into it.

Let me assuage your heartbreak – it is no longer 1950. While some might yearn for the “good-old-days”, they probably were not that good and we have computers that can take care of these things for us.

Network Models

Purpose

We have already learned about how linear programming can help us to find our objective function. We also talked a little bit about how broadly useful it is to the sciences. One reason that it is so useful is that it can be extended to so many different domains.

Vocab       Transportation            Communication           Water
Product     Buses, cars, etc.         Messages                Water
Nodes       Bus stops, intersections  Relay stations          Lakes, reservoirs
Arcs/Edges  Streets                   Communication channels  Pipelines, rivers

Let’s consider the following figure:

Each path through our network has an associated cost:

from to cost
LA Omaha 1
LA Topeka 7
Omaha Chicago 2
Omaha Indianapolis 7
Omaha Topeka 5
Topeka Indianapolis 3
Chicago Topeka 1
Chicago Boston 8
Indianapolis Chicago 3
Indianapolis Boston 2

Using linear programming to solve such network problems is called network flow programming, and any such problem can be expressed as a minimum-cost flow program. For the vast majority of our network problems, we will be minimizing our objective function. Every node has two flows, inflows and outflows, and these flows are used to produce a sparse matrix of node-arc incidences. At an ordinary node, the inflows must balance the outflows: \(\sum x_{in} - \sum x_{out} = 0\). This is called flow conservation. We can define a source node by changing that equation to \(\sum x_{out} - \sum x_{in} = e\), where \(e\) is the amount the node supplies; alternatively, we can define a sink node with \(\sum x_{out} - \sum x_{in} = -e\).
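To make the node-arc incidence idea concrete, we can build that sparse matrix for the network above. The +1/−1 sign convention used here (flow leaves the from-node, enters the to-node) is one common choice:

```r
# the edge list from the cost table above
edges = data.frame(
  from = c("LA", "LA", "Omaha", "Omaha", "Omaha", "Topeka",
           "Chicago", "Chicago", "Indianapolis", "Indianapolis"),
  to   = c("Omaha", "Topeka", "Chicago", "Indianapolis", "Topeka",
           "Indianapolis", "Topeka", "Boston", "Chicago", "Boston"),
  cost = c(1, 7, 2, 7, 5, 3, 1, 8, 3, 2))

nodes = union(edges$from, edges$to)
incidence = matrix(0, nrow = length(nodes), ncol = nrow(edges),
                   dimnames = list(nodes, NULL))

for (j in seq_len(nrow(edges))) {
  incidence[edges$from[j], j] = 1   # flow leaves the from-node
  incidence[edges$to[j], j]   = -1  # flow enters the to-node
}

incidence
```

Every column sums to zero (each arc leaves one node and enters another), which is exactly why flow conservation falls out of this matrix.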

A Classic Problem

The transportation problem is a classic problem for such network flow problems and we can easily see how we can convert it to a linear programming problem:

\[ \begin{align} M = 35x_{11} + 30x_{12} + 40x_{13} + 32x_{14} + 37x_{21} + 40x_{22} + \\ 42x_{23} + 25x_{24} + 40x_{31} + 15x_{32} + 20x_{33} + 28x_{34} \\ subject \, to \\ x_{11} + x_{12} + x_{13} + x_{14} \leq 1200 \\ x_{21} + x_{22} + x_{23} + x_{24} \leq 1000 \\ x_{31} + x_{32} + x_{33} + x_{34} \leq 800 \\ x_{11} + x_{21} + x_{31} \geq 1100 \\ x_{12} + x_{22} + x_{32} \geq 400 \\ x_{13} + x_{23} + x_{33} \geq 750 \\ x_{14} + x_{24} + x_{34} \geq 750 \\ x_{ij} \geq 0 \end{align} \]

Our Problem

We could solve the classic problem by hand without too much effort (it would just take a little bit of time). More modern problems, though, would be a bit tougher.

Look at this data:

We would be dealing with a large number of equations and people tend not to do too well with large tables of information.

Maybe we could do better with a visualization?

Integer Programming

Purpose

We have already seen integer constraints in our linear programming problems, but we really have not given them much thought beyond that we are forcing integers. While integer programs seem simple on their face, they are actually far more complex than our standard linear programming problems. This is further complicated by integer programming being divided into a few different families: namely mixed integer programming (MIP) and zero-one programming.

In a MIP, some of the decision variables are constrained to be integers while others may take fractional values. In a zero-one program, every variable is restricted to 0 or 1, the same type of binary constraint that we discussed with our network models.

An Important Aside

We are now in the space where we need to discuss issues related to P versus NP problems. If an answer can be obtained in polynomial time, then it is a P problem.

If a proposed solution can be verified in polynomial time, even when no polynomial-time method for finding it is known, the problem is said to be NP.

Classic Examples

The Knapsack Problem

Let’s transport ourselves back to the days of people riding the rails. If you have your trusty knapsack with you, you are faced with decisions about what to bring based upon the weight that you can carry. Let us say, for instance, that you have 10 objects with various weights and that you can carry any combination of objects so long as you do not exceed 20 pounds. Let’s also assume that you have a personal value attached to each item, so that some items are more important to you than others. We have a problem where we are trying to fit the things most important to us into our bag, while being constrained by the weight the bag can hold. The knapsack problem is said to be NP-complete (essentially, no known algorithm is both fast and always correct for every instance).

Let’s see what it looks like in notation:

\[ \begin{align} maximize\, \sum_{i=1}^nv_ix_i \\ subject\,to\, \sum_{i=1}^nw_ix_i \leq W, \\ x_{i} \in \{0,1\} \end{align} \]

Where \(n\) is the number of items, \(v_i\) is the value you attach to item \(i\), \(w_i\) is the item’s weight, and \(W\) is the weight capacity.

While it looks simple, it is not. Gone are the days when we could fall back on that nice corner-point graphical method we used for our linear programs (there is a graphical method, but it is far more complicated). We could also just go through all of the possible combinations (enumeration), but we would be dealing with \(2^{10}=1024\) candidate solutions.
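Since \(2^{10}=1024\) is tiny by computer standards, we can actually watch enumeration work. This sketch mirrors the random setup used in the ROI example below, but with an assumed seed, so the particular values and weights are illustrative only.

```r
n <- 10
W <- 20
set.seed(1)                 # assumed seed, purely for reproducibility
v <- round(runif(n, min = 1, max = 5))  # item values
w <- runif(n, min = 1, max = 7)         # item weights

# Every 0/1 inclusion pattern: 2^10 = 1024 rows, one column per item
combos <- expand.grid(rep(list(0:1), n))

weights  <- as.matrix(combos) %*% w     # total weight of each pattern
values   <- as.matrix(combos) %*% v     # total value of each pattern
feasible <- weights <= W                # patterns that fit in the bag

# Best feasible pattern (infeasible patterns are zeroed out)
best <- which.max(values * feasible)
combos[best, ]                          # items chosen by exhaustive search
```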

Instead, many packages utilize branch and bound algorithms, where the zero-one constraint is relaxed to two continuous constraints (i.e., \(x\leq1\) and \(x\geq0\)). The algorithm then performs a tree search, but the details are beyond the scope of our engagement.
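Written out, that relaxation step just replaces the binary requirement with box constraints, producing an ordinary LP whose optimum bounds the integer optimum:

\[ x_i \in \{0, 1\} \quad \longrightarrow \quad 0 \leq x_i \leq 1 \]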

Let’s see a few different ways of handling the problem in R.

ROI


library(ROI)

library(ROI.plugin.glpk)

n = 10 # Setting our number of variables

W = 20 # Our max weight

v = round(runif(n, min = 1, max = 5)) # A random assignment of our values

v

 [1] 2 5 3 3 4 4 4 1 4 4

w = runif(n, min = 1, max = 7) # Random weights

w

 [1] 2.020576 1.341635 2.616387 4.498461 5.177600 3.476726 2.312741
 [8] 1.833463 3.882303 6.222713

constraints = L_constraint(w, "<=", W) # Creating the constraints

model = OP(objective = v, 
            constraints = constraints,
            bounds = V_bound(li = 1:n, lb = rep.int(0, n), ui = 1:n, ub = rep.int(1, n)), 
            types = rep.int("B", n), 
            maximum = TRUE)
model

ROI Optimization Problem:

Maximize a linear objective function of length 10 with
- 10 binary objective variables,

subject to
- 1 constraint of type linear.
- 0 lower and 10 upper non-standard variable bounds.

Before we go on to solving the problem, let’s discuss that bounds argument for a minute. We put a function into the bounds argument: V_bound(). The V_bound function takes li and ui, which specify the variables for which we are setting bounds, and lb and ub, which specify the lower and upper bounds of those variables (0, 1).

We can now solve the problem:


res = ROI_solve(model, "glpk", verbose = TRUE)

<SOLVER MSG>  ----
GLPK Simplex Optimizer, v4.47
1 row, 10 columns, 10 non-zeros
*     0: obj =  0.000000000e+000  infeas = 0.000e+000 (0)
*     9: obj =  2.536034568e+001  infeas = 0.000e+000 (0)
OPTIMAL SOLUTION FOUND
GLPK Integer Optimizer, v4.47
1 row, 10 columns, 10 non-zeros
10 integer variables, all of which are binary
Integer optimization begins...
+     9: mip =     not found yet <=              +inf        (1; 0)
+    16: >>>>>  2.200000000e+001 <=  2.400000000e+001   9.1% (8; 0)
+    20: >>>>>  2.300000000e+001 <=  2.400000000e+001   4.3% (7; 5)
+    25: >>>>>  2.400000000e+001 <=  2.400000000e+001   0.0% (5; 9)
+    25: mip =  2.400000000e+001 <=     tree is empty   0.0% (0; 21)
INTEGER OPTIMAL SOLUTION FOUND
<!SOLVER MSG> ----

solution(res)

 [1] 0 1 1 1 0 1 1 1 1 0

ompr

In an effort to create a “modern” approach to building things in R, the ompr package allows for model building through pipes.

We will need the following packages:


library(dplyr)

library(ompr)

library(ompr.roi)

First, we specify that we are creating a mixed integer model with MIPModel():


model = MIPModel()

Next, we add variables to the model. Again, these represent our 10 possible items:


model = MIPModel() %>% 
  add_variable(x[i], i = 1:n, type = "binary") 

Now we need to specify our objective function:


model = MIPModel() %>% 
  add_variable(x[i], i = 1:n, type = "binary") %>% 
  set_objective(sum_expr(v[i] * x[i], i = 1:n), sense = "max")

That sum_expr function is essentially giving us the same thing as Excel’s SUMPRODUCT.

After setting our objective, we can add our constraints:


model = MIPModel() %>% 
  add_variable(x[i], i = 1:n, type = "binary") %>% 
  set_objective(sum_expr(v[i] * x[i], i = 1:n), sense = "max") %>% 
  add_constraint(sum_expr(w[i] * x[i], i = 1:n) <= W)

We want the combined weight of our items to be less than or equal to W (which is still 20 pounds).

Finally, we can solve the problem with solve_model and use get_solution to see the results:


model = MIPModel() %>% 
  add_variable(x[i], i = 1:n, type = "binary") %>% 
  set_objective(sum_expr(v[i] * x[i], i = 1:n), sense = "max") %>% 
  add_constraint(sum_expr(w[i] * x[i], i = 1:n) <= W) %>% 
  solve_model(with_ROI("glpk", verbose = TRUE))

<SOLVER MSG>  ----
GLPK Simplex Optimizer, v4.47
1 row, 10 columns, 10 non-zeros
*     0: obj =  0.000000000e+000  infeas = 0.000e+000 (0)
*     9: obj =  2.536034568e+001  infeas = 0.000e+000 (0)
OPTIMAL SOLUTION FOUND
GLPK Integer Optimizer, v4.47
1 row, 10 columns, 10 non-zeros
10 integer variables, all of which are binary
Integer optimization begins...
+     9: mip =     not found yet <=              +inf        (1; 0)
+    16: >>>>>  2.200000000e+001 <=  2.400000000e+001   9.1% (8; 0)
+    20: >>>>>  2.300000000e+001 <=  2.400000000e+001   4.3% (7; 5)
+    25: >>>>>  2.400000000e+001 <=  2.400000000e+001   0.0% (5; 9)
+    25: mip =  2.400000000e+001 <=     tree is empty   0.0% (0; 21)
INTEGER OPTIMAL SOLUTION FOUND
<!SOLVER MSG> ----

get_solution(model, x[i])

   variable  i value
1         x  1     0
2         x  2     1
3         x  3     1
4         x  4     1
5         x  5     0
6         x  6     1
7         x  7     1
8         x  8     1
9         x  9     1
10        x 10     0
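Once solved, the solution object returned by get_solution can be filtered like any data frame to pull out the packed items. Here is a self-contained sketch of that step; the item values and weights are made up for illustration (they are not the ones from the session above).

```r
library(dplyr)
library(ompr)
library(ompr.roi)
library(ROI.plugin.glpk)

# Assumed small data so the sketch runs on its own
n <- 5
W <- 10
v <- c(4, 2, 10, 2, 1)   # illustrative values
w <- c(12, 2, 4, 1, 1)   # illustrative weights

result <- MIPModel() %>%
  add_variable(x[i], i = 1:n, type = "binary") %>%
  set_objective(sum_expr(v[i] * x[i], i = 1:n), sense = "max") %>%
  add_constraint(sum_expr(w[i] * x[i], i = 1:n) <= W) %>%
  solve_model(with_ROI("glpk"))

sol <- get_solution(result, x[i])
chosen <- sol$i[sol$value > 0.5]  # indices of the packed items
sum(v[chosen])                    # total value carried
sum(w[chosen])                    # total weight used, at most W
```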

Crew Scheduling

Excel time.

Nonlinear Programming

What makes a problem nonlinear? Let’s start by demonstrating a linear equation. If we have an equation \(y = 3 + 2x_1 + 7x_2\) and supply numbers for \(x_1\) (1:10) and \(x_2\) (11:20), we would get the following line:

It is that same song and dance of for every unit increase in x, we have a proportional increase in y.

Nonlinear functions behave differently. If we take our previous equation and tweak it by adding a higher-order term (\(y = 3 + x_1^3 + 7x_2\)), we get the following:

We no longer have a proportional increase.
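A quick numeric check makes the difference concrete; the coefficients below come straight from the two equations above.

```r
x1 <- 1:10
x2 <- 11:20

y_linear    <- 3 + 2 * x1 + 7 * x2  # linear: constant slope
y_nonlinear <- 3 + x1^3 + 7 * x2    # nonlinear: slope changes with x1

diff(y_linear)     # equal first differences -> a straight line
diff(y_nonlinear)  # growing first differences -> curvature
```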

A problem becomes a nonlinear program (NLP) when any part of it, the objective function and/or the constraints, is nonlinear.

The Optima Problem

The previous image shows us a really nice 2D version of a nonlinear trend. For the vast majority of our statistical and mathematical models, though, we should really be observing them in multidimensional space. Figuring out why optimal solutions are so hard to find for nonlinear problems makes more sense once we map the models into 3 dimensions.

We can see that a 3D surface can have multiple local optima, but there can be only one global optimum.

In NLP, we might not actually find the global optimum; instead, we may find any number of our local optima. Why can’t we find the global optimum? Our starting point is one source of blame. When we (or our program) select a starting point on the surface, the algorithm will typically move toward the optimum nearest that point, so a poor starting point can leave us stuck at a local optimum.

Examples

You can breathe a sigh of relief – we won’t be messing around with R for NLP. It is not because R won’t do it (it can and does very well), but because it is very coding heavy.